[Title of Document] Abstract
[Problem] 

The problem is to decrease the cache readout
latency in a computer system having a cache, in the
case where the cache tag portion and the cache data
portion are managed separately.
[Solving Means] 

In memory read processing, a coherent 
controller issues an advanced speculative read 
request for (speculatively) reading data from a 
cache data section in advance to a cache data 
controller, before reading a cache tag from a cache 
tag section and conducting cache hit check. If a 
cache hit has occurred, the cache data controller 
returns the data subjected to speculative reading 
as response data, at the time when the cache data 
controller has received a read request issued by the 
coherent controller. 
[Selected Drawing] Fig. 1 
[Scope of Claim for a Patent]
[Claim 1]

A computer system including a CPU, a memory, 
and a cache located in a hierarchy class between said 
CPU and said memory, characterized in that 

said computer system comprises a coherent 
controller for determining whether a request 
supplied from said CPU hits said cache to thereby 
issue a request to said cache or said memory; and 
a cache data controller for controlling reading or 
writing of data registered in said cache, in 
accordance with a request issued by said coherent 
controller,

said coherent controller comprises means
responsive to acceptance of a read request from said
CPU, for conducting hit decision of said cache,
issuing an advanced speculative read request to said
cache data controller, and issuing a read request
to said cache data controller if the hit decision
is a cache hit, and

said cache data controller comprises means
responsive to acceptance of an advanced speculative
read request issued by said coherent controller, for
reading data from said cache and holding the data,
and means responsive to acceptance of a read request
from said coherent controller, for sending said held
speculative read data to said CPU as read response
data.

[Claim 2]

A computer system according to claim 1, 
characterized in that 

said coherent controller comprises means 
for issuing a speculative read data discarding 
request to said cache data controller when a hit 
decision is a cache miss, and

said cache data controller comprises means 
responsive to acceptance of a speculative read data 
discarding request issued by said coherent 
controller, for discarding speculative read data of 
a speculative read request corresponding to the 
speculative read data discarding request. 
[Claim 3]

A computer system according to claim 1 or 
2, characterized in that said cache comprises an 
n-way associative cache, and said cache data 
controller accepts an advanced speculative read 
request corresponding to n ways issued by said 
coherent controller, reads out data corresponding 
to n ways from said cache, and holds the data.
[Detailed Description of the Invention]
[0001]

[Technical Field Pertinent to the Invention]

The present invention relates to a computer
system, and in particular to a speculative read
control scheme of cache data in a computer system
having a cache between a CPU and a main memory.

[0002] 
[Prior Art]

In recent years, CPU performance has improved
far faster than memory performance, and the
performance difference between them tends to widen.
A cache which is faster than the main memory and which 
stores a part of contents of the main memory is used 
to absorb such a performance difference between the 
CPU and the memory and shorten the effective memory 
access time. 

[0003] 

Selection of the cache capacity in a 
computer system largely depends upon the 
configuration of the computer system. In many cases, 
a high performance CPU itself has a cache of a large
capacity. In the case where a large number of CPUs
are connected, approximately 2 to 8 CPUs are first
connected by a bus or a switch. In the case where
more CPUs are connected, these groups are further
connected by a bus or a switch. In many cases, a hierarchical
memory system is thus formed. If such a 
configuration is adopted, then the access latency 
between the CPU and the main memory increases, and 
it exerts a great influence upon the performance in
the case where a cache miss has occurred in a CPU.
Therefore, it is necessary to provide a cache in the
highest class having approximately 2 to 8 CPUs
connected and controlled by a bus or a switch, and
thereby avoid the performance degradation when cache
overflow of the CPUs has occurred. For example, in
JP-A-9-128346, a cache configuration example of such
a hierarchical bus system is disclosed. The cache
capacity at this time needs to be at least the total 
cache capacity of all CPUs connected above it. The 
reason can be explained as follows. When overflow 
has occurred in a CPU cache in the case where the 
above described cache capacity is equal to or less 
than the capacity of the CPU cache, cache overflow 
easily occurs also in the classes located below it,
and the system performance may degrade markedly.
[0004] 

By the way, fast devices such as SRAMs are 
typically used in the cache in order to implement 
the fast access of the cache. In a typical 
configuration, a cache tag (tag address) and cache 
data are stored in the same location in this SRAM. 
When processing a read request, the cache tag and 
the cache data are simultaneously read out. The
cache tag is checked with a request address. In the
case of a hit, the cache data can be used immediately.
However, SRAMs are lower than DRAMs used in the main
memory or the like in degree of integration by at
least one order. For forming a large capacity cache,
it is necessary to use a large number of SRAMs. In
the case where a large number of SRAMs are used,
interfaces with a large number of SRAMs must be
formed. Therefore, the number of pins of an LSI for
controlling the cache increases, and some of the pins
cannot be accommodated in one LSI. The cache tag
portion is used for cache hit check. The increase
of time caused by this hit check directly results
in an increase of memory access latency. Therefore,
the interface with the cache tag portion needs to be
accommodated in the same LSI as the interface with
the CPU bus. By placing the interface with the cache
data portion in an LSI separate from the LSI including
the CPU bus, and giving the link between the two LSIs
a data width nearly equal to the CPU bus width, the
pin bottleneck of the LSIs can be eliminated.
[0005] 

On the other hand, as a scheme for improving
the hit rate of the cache, there is a set
associative scheme. For example, in JP-A-5-225053,
there is disclosed a scheme of conducting tag
comparison of a set associative cache and increasing
its speed.
[0006] 

In hit check in the set associative scheme,
cache tags of a plurality of cache lines of a
plurality of ways are read out, and hit check is
conducted simultaneously in a plurality of lines.
At this time, which data of the plurality of lines
will be used is not known until the cache hit check
is completed. In a cache (on chip cache) mounted on
a CPU, it is typical to adopt such a scheme that cache 
access latency is reduced by conducting readout of 
the cache data simultaneously with readout of the 
cache tag and selecting only necessary data after 
the cache hit check has been completed. 
[0007] 

FIG. 12 shows an example of a configuration 
of such a set associative cache. FIG. 12 shows a 
4-way set associative cache including N entries. 
Each entry includes four ways, a 0th way 1000, a first
way 1001, a second way 1002, and a third way 1003.
Information contained in the cache includes STAT
1004 indicating the state (valid or invalid) of the
cache, a cache tag (address) 1005, and cache data
1006. In a typically adopted method, a low order
address of a memory address is used as the entry
number of the cache, and a high order address is used
as the cache tag. In an on-chip cache, a cache tag
1005 and cache data 1006 are stored together as shown
in FIG. 12. Therefore, it is possible to read
simultaneously the cache tag 1005 and the cache data
of each of the ways 1000 to 1003 of a pertinent entry,
and immediately select data by using the way number
for which the cache hit has occurred.
[0008] 

If it is attempted to implement a set 
associative cache having a large capacity, however, 
it is necessary to separate an LSI having an 
interface with the cache data from an LSI having 
interfaces with the CPU bus and the cache tag. In 
this case, the cache tag and the cache data cannot 
be read at the same time. Therefore, the cache tag 
and the cache data are read out separately. If at
this time the data width between the LSI having the
interface with the cache tag and the LSI having the
interface with the cache data is only approximately
the CPU bus width due to a physical restriction, then it
takes too long a time to read out all cache data of
a plurality of lines into the LSI of the CPU bus side.
For example, in the case where the CPU bus width is
8 bytes and the line size is 32 bytes, it takes 4
cycles x 4 ways = 16 cycles to transfer lines
corresponding to 4 ways from the cache data side LSI
to the cache tag side LSI. This means that it takes
16 cycles whenever the cache is referred to. As a 
result, the performance is remarkably degraded. 
For preventing this performance degradation, it 
becomes necessary to read out cache data after a 
result of cache hit check is found. However, this 
causes an increase of access latency of the cache. 
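
The transfer cost in the example above can be checked with a short calculation. The following is only a sketch; the function name transfer_cycles and its parameter names are hypothetical and do not appear in the specification, and one bus-width transfer per cycle is assumed.

```python
def transfer_cycles(line_bytes: int, bus_bytes: int, ways: int) -> int:
    """Cycles needed to move `ways` full cache lines over a bus
    that carries `bus_bytes` bytes per cycle (assumption)."""
    cycles_per_line = -(-line_bytes // bus_bytes)  # ceiling division
    return cycles_per_line * ways


# 32-byte lines, 8-byte bus, 4 ways: 4 cycles x 4 ways
print(transfer_cycles(line_bytes=32, bus_bytes=8, ways=4))  # 16
```

Transferring only the one line selected after the hit check would take 4 cycles under the same assumption, which is why reading all ways across the narrow link on every reference is avoided.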
[0009]

[Problem to be solved by the Invention] 

In the case where a large capacity cache of 
the set associative scheme or the like is provided 
between the CPU and the main memory as described 
above, it becomes necessary to put the cache tag 
portion and the cache data portion in separate LSIs 
and manage them under the restrictions of, for 
example, the number of pins of LSIs. In the case
where such a configuration is adopted, there is a 
problem that the cache readout latency increases if 
the cache tag is read out and the cache hit check 
is conducted, and thereafter the cache data is read 
out. 

[0010] 

An object of the present invention is to 
realize shortening of the cache data readout time
in the case where the cache tag portion and the cache 
data portion are managed in separate LSIs as 
described above, in a computer system having a cache 
such as an n way set associative cache located in 
a class between the CPU and the main memory in 
hierarchy. 
[0011] 

[Means for Solving Problem] 

In order to achieve the above described 
object, in accordance with the present invention, 
an advanced or speculative read request is issued 
to a controller of the cache data portion before
conducting the cache hit check. Thus data supplied
from the cache is read in advance and held in the
controller. In the case where a cache hit has
occurred at the time when the cache hit check has
finished, the data subjected to speculative readout
is used. As a result, the time taken to read out the
cache data after the cache hit check is shortened.
[0012] 

[Mode for Carrying Out the Invention] 

Hereafter, an embodiment of the present 
invention will be described by referring to the drawings.
FIG. 1 shows a computer system of an embodiment of
the present invention. The present system includes
two CPUs, i.e., CPU(0) 1 and CPU(1) 2, a storage
controller (SCU) 4, a cache tag section 5, a cache
data controller 6, a cache data section 7, a main
memory 8, and a bus 3 for connecting the CPU(0) 1,
CPU(1) 2, and the SCU 4. Furthermore, the SCU 4
includes a bus 16, a memory access request queue 17,
a write data buffer 18, a read response data buffer
19, a coherent controller 20, a memory access
controller 21, a bus 22, and buses 23 to 29 for
connecting them. Here, the number of CPUs (i.e.,
the number of nodes) is two. As a matter of course,
however, the number of nodes may be more than two, or
the number may be one.
[0013] 
Although not illustrated, it is assumed
that each of the CPU(0) 1 and CPU(1) 2 has a built-in
cache. Furthermore, it is assumed that each of the
cache tag section 5 and the cache data section 7
includes an SRAM which is a high speed memory device.
It is assumed that the main memory 8 includes a DRAM
which is a low speed memory device. Furthermore, it
is assumed that the cache of the present system
formed of the cache tag section 5 and the cache data
section 7 is a 4-way set associative cache.
[0014] 

FIG. 2 shows relations among a request
address supplied from the CPU(0) 1 and CPU(1) 2, a
cache tag, and a cache entry number. In the present
embodiment, it is assumed that the request address
supplied from the CPU(0) 1 and CPU(1) 2 has 32 bits
and the number of cache entries is 256 K entries.
Furthermore, it is assumed that the cache line size
is 32 bytes. In FIG. 2, numeral 100 denotes the
request address (ADR <31:0>) output from the CPU(0)
1 and CPU(1) 2. Since the cache line size is 32 bytes,
six low-order bits of the ADR 100 indicate an address
in the cache line. Since the number of cache entries
is 256 K, 18 bits of ADR <24:7> become a cache entry
number 102. ADR <31:25> which is the remaining
high-order address becomes a cache tag 101.
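
The field extraction described above can be sketched as follows. This is an illustration only: it takes the bit ranges ADR <24:7> and ADR <31:25> exactly as stated in the text, leaves the bits below the entry number uninterpreted, and uses hypothetical names (bits, split_address) that do not come from the specification.

```python
# Bit ranges as stated in the text (inclusive, bit 0 = least significant).
TAG_HI, TAG_LO = 31, 25      # cache tag 101
ENTRY_HI, ENTRY_LO = 24, 7   # cache entry number 102


def bits(value: int, hi: int, lo: int) -> int:
    """Return the bit field value<hi:lo> as an unsigned integer."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


def split_address(adr: int) -> tuple[int, int]:
    """Split a 32-bit request address into (cache tag, cache entry number)."""
    adr &= 0xFFFFFFFF
    return bits(adr, TAG_HI, TAG_LO), bits(adr, ENTRY_HI, ENTRY_LO)


# The 18-bit entry field addresses exactly 256 K entries.
assert 1 << (ENTRY_HI - ENTRY_LO + 1) == 256 * 1024
```

For example, split_address((0x55 << 25) | (0x2ABCD << 7) | 0x13) returns (0x55, 0x2ABCD); the low-order bits below bit 7 do not affect either field.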
[0015] 

FIGS. 3 and 4 show configuration examples 
of the cache tag section 5 and the cache data section
7. As shown in FIG. 3, the cache tag section 5 is
a 256 K entry, 4-way aggregate of cache states
(STATs) 210 and cache tags 211. Furthermore, as
shown in FIG. 4, the cache data section 7 is a 256
K entry, 4-way aggregate of cache data 220. There
is a one-to-one correspondence between entries and 
ways of the cache tag section 5 and the cache data 
section 7. For example, a cache tag corresponding 
to cache data stored in a block of a 0th entry and 
a 0th way of the cache data section 7 is stored in 
a block of a 0th entry and a 0th way of the cache 
tag section 5. The cache state (STAT) 210 indicates 
whether cache data (cache line) of the pertinent 
block is valid or invalid. 
[0016] 

Returning to FIG. 1, the cache data
controller 6 accepts a cache data read/write request 
issued by the SCU 4, and reads/writes cache data 
to/from the cache data section 7. A path 12 is a path 
for sending an access request fed from the SCU 4 to 
the cache data controller 6. A path 13 is a path for 
exchanging data between the SCU 4 and the cache data 
controller 6. A path 30 is a signal line to be used 
by the cache data controller 6 to conduct read/write 
control on the cache data section 7. A path 31 is 
a path for exchanging data between the cache data 
controller 6 and the cache data section 7. In the 
present embodiment, the number of signals of the
paths 30 and 31 is large because of the 4-way set 
associative cache. Therefore, it is physically 
impossible to provide pins in the SCU 4 directly for 
the cache data section 7. Accordingly, the cache 
data controller 6 is formed as a chip separate from 
the chip of the SCU 4. As a result, the paths 12 and 
13 are smaller than the paths 30 and 31 in the number 
of signals.
[0017]

The memory access request queue 17 in the
SCU 4 is a queue for buffering memory access requests
issued by the CPU(0) 1 and CPU(1) 2 and sending a
memory access request to the coherent controller 20
if the coherent controller 20 is not busy. The data
buffer 18 is a buffer for temporarily storing write
data supplied from the CPU(0) 1 and CPU(1) 2. The
data buffer 19 is a buffer for temporarily storing
read response data to be returned to the CPU(0) 1
and CPU(1) 2. The coherent controller 20 determines
whether a memory access request issued by the CPU(0)
1 and CPU(1) 2 results in a cache hit, and issues an
access request to the cache data controller 6 and
the memory access controller 21. The memory access
controller 21 effects access control of the main
memory 8 in accordance with the access request issued
by the coherent controller 20.
[0018]
The coherent controller 20 decomposes the
request address 100 supplied from the CPU(0) 1 and
CPU(1) 2 as shown in FIG. 2, reads out cache tags
211 from each way of a pertinent entry of the cache
tag section 5 shown in FIG. 3 by using the cache entry
number 102, compares them with the cache tag 101 of
the request address 100, and thereby conducts cache 
hit check. This cache hit check itself is basically 
the same as that of the conventional technique. 
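
The hit check over the ways of one entry can be sketched as below. This is a minimal sequential model under stated assumptions, not the hardware comparison itself (which the text describes as conducted on all ways together); TagBlock and hit_check are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class TagBlock:
    valid: bool = False  # cache state (STAT) 210: valid or invalid
    tag: int = 0         # cache tag 211


def hit_check(ways: Sequence[TagBlock], request_tag: int) -> Optional[int]:
    """Return the way number whose valid tag equals the request tag,
    or None when the request misses in every way."""
    for way_number, block in enumerate(ways):
        if block.valid and block.tag == request_tag:
            return way_number
    return None


entry = [TagBlock(True, 3), TagBlock(False, 9), TagBlock(True, 9), TagBlock()]
print(hit_check(entry, 9))  # 2 (way 1 holds tag 9 but is invalid)
```

An invalid block never produces a hit, even if its stored tag happens to match.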
[0019] 

Operation of an embodiment in the computer 
system of FIG. 1 will now be described. In the case 
where data required for execution of an instruction 
is not stored in the built-in cache, the CPU(0) 1
or CPU(1) 2 issues a memory access request to the
SCU 4 via the bus 3. In the SCU 4, the memory access
request is stored in the memory access request queue
17 via the bus 16. In the case of a write request,
data is also sent from the CPU(0) 1 or CPU(1) 2. In
the SCU 4, therefore, write data is stored in the 
data buffer 18 via the bus 16. If the coherent 
controller 20 is not busy, a memory access request 
is sent from the memory access request queue 17 to 
the coherent controller 20. 
[0020] 

By referring to the cache tag section 5, the 
coherent controller 20 determines whether the 
received memory access request hits the cache.
In the case of a read request, however, if data is
read out from the cache data section 7 via the cache
data controller 6 only after a result of the cache hit
decision is obtained, the access latency becomes large.
Therefore, before conducting a cache hit decision by
referring to the cache tag section 5, the coherent
controller 20 issues a request for conducting
advanced or speculative readout to the cache data
controller 6. Thus, the coherent controller 20
reads out data which should be read out when a hit
has occurred, from the cache data section 7 into the
cache data controller 6 in advance. When a hit has
occurred, the coherent controller 20 uses this data
read in advance.
[0021]

FIG. 5 is a processing flow of an embodiment 
of the coherent controller 20. Hereafter, detailed 
operation of the coherent controller 20 will be 
described by referring to FIG. 5. 
[0022] 

Upon accepting a memory access request from 
the CPU(0) 1 or CPU(1) 2 (step 300), the coherent
controller 20 determines whether the request is a
read request (step 301). If the request is a read
request, the coherent controller 20 issues an
advanced or speculative read request to the cache
data controller 6 via the paths 25 and 12 (step 302).
At the same time, the coherent controller 20 sends
a cache entry number of the pertinent read request
to the cache tag section 5 via the path 23 and a path
10, and reads out cache tags corresponding to 4 ways
of the pertinent entry from the cache tag section
5 via a path 11 and the path 24 (step 303). The
coherent controller 20 determines whether the cache
tags read out from the cache tag section 5 hit the
cache tag of the read request (step 304). When a hit
has occurred, the coherent controller 20 issues a
read request to the cache data controller 6 via the
paths 25 and 12 (step 305). The read request at this
time includes a way number for which the hit has
occurred, along with a cache entry number. In the
case where a cache miss has occurred, the coherent
controller 20 issues a read request to the memory
access controller 21 (step 306), and newly registers
a cache tag of the pertinent memory access request
in a desired way of the pertinent entry of the cache
tag section 5 via the paths 23 and 10 (step 307).
The memory access controller 21 accesses the main
memory 8 via the path 27 and a path 14, and reads
out data onto a path 15 and the path 28. When response
data is returned from the main memory 8, the coherent
controller 20 issues a write request to the cache
data controller 6 in order to register this response
data with the cache data section 7, and sends the
response data to the cache data controller 6 via the
bus 22 and the paths 26 and 13 as write data (step
308). At the same time, the coherent controller 20
stores the response data in the data buffer 19 from
the bus 22 in order to send the response data to the
CPU (step 309).
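
The read path of FIG. 5 described above can be summarized in a small software model. This is a sketch under assumptions, not the circuit: requests sent to the cache data controller are simply appended to a list, the main memory is a dictionary keyed by (tag, entry), and the names handle_read, requests and victim_way are hypothetical (the text only says the new tag is registered in "a desired way").

```python
def handle_read(entry, tag, tag_section, requests, memory, victim_way=0):
    """Model of steps 302 to 309 for one read request.

    tag_section: dict mapping entry number -> list of (valid, tag) per way
    requests:    list collecting requests sent to the cache data controller
    memory:      dict mapping (tag, entry) -> line data (assumed model)
    """
    requests.append(("speculative_read", entry))         # step 302
    ways = tag_section[entry]                             # step 303
    hit_way = next((w for w, (valid, t) in enumerate(ways)
                    if valid and t == tag), None)         # step 304
    if hit_way is not None:
        requests.append(("read", entry, hit_way))         # step 305
        return ("cache", hit_way)
    data = memory[(tag, entry)]                           # step 306
    ways[victim_way] = (True, tag)                        # step 307
    requests.append(("write", entry, victim_way, data))   # step 308
    return ("memory", data)                               # step 309
```

In this model the speculative read request is always issued before the tags are examined, so on a hit the later read request only has to carry the entry number and the hit way number.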
[0023] 

If the request received from the CPU(0) 1
or CPU(1) 2 is a write request, then the coherent
controller 20 reads out cache tags corresponding to
4 ways of the pertinent entry from the cache tag
section 5 in the same way as the case of the read
request (step 310), and determines whether there has
occurred a cache hit (step 311). If there has
occurred a cache hit, the coherent controller 20
issues a write request to the cache data controller
6 via the paths 25 and 12 (step 312). At the same
time, the coherent controller 20 sends write data
to the cache data controller 6 via the paths 26 and
13 (step 313). The write request at this time
includes a way number for which the hit has occurred,
along with a cache entry number. In the case where
a cache miss has occurred, the coherent controller
20 issues a write request to the memory access
controller 21 (step 314). At the same time, the
coherent controller 20 sends write data to the main
memory 8 via the paths 28 and 15 (step 315). The
memory access controller 21 accesses the main memory 
8 via the paths 27 and 14, and writes the data into 
the main memory 8. 
[0024]

Especially in the case where a read request 
has been accepted, the coherent controller 20 thus 
has a function of issuing an advanced or speculative 
read request to the cache data controller 6 before 
conducting a cache hit check by using the cache tag 
section 5. In the case of a write request, the 
operation is basically the same as the operation of 
the conventional technique. 
[0025] 

The configuration and operation of the 
cache data controller 6 will now be described. With
reference to FIG. 1, the cache data controller 6
exchanges data with the cache data section 7 in
accordance with an advanced or speculative read
request, a read request, and a write request supplied
from the coherent controller 20 via the path 12.
[0026] 

FIG. 6 is a detailed block diagram of the 
cache data controller 6. The cache data controller 
6 includes a request controller 400, a speculative 
read request buffer 401, an address comparator 
section 402, speculative read data buffers 403 to 
406, buses 407 and 408 to 411, selectors 412 and 413
to 416, and paths 417 to 428. 
[0027] 

The request controller 400 decodes a 
request received from the coherent controller 20 via 
the path 12, determines processing to be conducted
in the cache data controller 6 on the basis of a kind
of the accepted request, and controls respective
components. The speculative read request buffer
401 is a buffer for holding a speculative read
request received from the coherent controller 20.
The speculative read data buffers 403 to 406 are
buffers for holding data read out from the cache data
section 7 in accordance with a speculative read
request. As shown in FIG. 4, the cache data section
7 of the present embodiment is 4-way set associative.
Data of the 0th way are stored in the speculative
read data buffer 403. Data of the first way are
stored in the speculative read data buffer 404.
Data of the second way are stored in the speculative
read data buffer 405. Data of the third way are
stored in the speculative read data buffer 406. The
address comparator section 402 determines whether 
an advanced or speculative read request, a read 
request, or a write request has the same cache entry 
as a request stored in the speculative read request 
buffer 401. 
[0028] 

FIGS. 7 and 8 show configuration examples 
of the speculative read request buffer 401 and the 
speculative read data buffers 403 to 406. As shown
in FIG. 7, the speculative read request buffer 401
includes a plurality of entries. Each entry
includes a valid bit (V) 500 and a cache entry number
501. The valid bit 500 is a bit indicating whether the
entry is valid or invalid. The cache entry number 
501 is a cache entry number which is the subject of 
a speculative read request stored in the pertinent 
entry. As shown in FIG. 8, each of the speculative 
read data buffers 403 to 406 also includes a 
plurality of entries. Cache data (32 B) 600 read out 
from the cache data section 7 speculatively by a 
speculative read request is stored in each entry. 
[0029] 

There is a one-to-one correspondence between
entries of the speculative read request buffer 401
and entries of the speculative read data buffers 403
to 406. For example, if it is assumed that a cache
entry number of a certain speculative read request
is stored in the 0th entry of the speculative read
request buffer 401, cache data corresponding to 4
ways read out from the cache data section 7
speculatively by the speculative read request are
stored in the 0th entry of the speculative read data
buffers 403 to 406. The number m of entries of the
speculative read request buffer 401 and the
speculative read data buffers 403 to 406 may be an
arbitrary number. Furthermore, the buffers 401 and
403 to 406 may be formed as one body.
[0030]

FIG. 9 is a processing flow of the request 
controller 400 in an embodiment. Hereafter,
detailed operation of the cache data controller 6 
will be described centering around the request 
controller 400 by referring to FIG. 9. 
[0031] 

Upon receiving a request from the coherent
controller 20 via the paths 12 and 417 (step 700),
the request controller 400 first determines whether
the request is a speculative read request (step 701).
If the request is a speculative read request, then
the request controller 400 determines whether a
request to the same cache entry is stored in the
speculative read request buffer 401 beforehand (step
702). Specifically, the request controller 400
outputs the cache entry number of the speculative
read request to the path 419. In addition, the
request controller 400 reads out cache entry numbers
of respective entries of the speculative read
request buffer 401, makes the address comparator
section 402 compare the cache entry number of the
speculative read request with the cache entry
numbers of respective entries, receives results of
the comparison via the path 420, and thereby
determines whether the same cache entry as that of
the speculative read request is stored in the
speculative read request buffer 401 beforehand. If
the same cache entry is stored, the newly received
speculative read request is discarded. If a request
to the same cache entry is not stored in the
speculative read request buffer 401, then the
request controller 400 determines whether the
speculative read request buffer 401 is full (step
703). If the speculative read request buffer 401 is
not full, then the request controller 400 registers
a new speculative read request with an empty entry
in the speculative read request buffer 401 via the
path 428 (step 705). If the speculative read request
buffer 401 is full, then the request controller 400
invalidates the oldest entry in the speculative read
request buffer 401 (step 704), and thereafter
registers a new request. Such an
invalidation algorithm is well known as an LRU (Least
Recently Used) method. Detailed description
thereof will be omitted. The registered
speculative read request is transferred to the cache
data section 7 via the paths 418 and 30 as a read
request. Cache data corresponding to 4 ways are
read out from the pertinent cache entry of the cache
data section 7 (step 706). The cache data are newly
stored in an entry of the speculative read data
buffers 403 to 406, corresponding to the entry in
the speculative read request buffer 401 with which
the speculative read request has been registered, via
the path 31, the buses 408 to 411, and the paths 423
to 426 (step 707). As a result, in the case where
the speculative read request buffer 401 is full, new
cache data is overwritten and stored in the pertinent
entry of the speculative read data buffers 403 to
406, corresponding to the invalidated entry in the
speculative read request buffer 401.
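
The handling of a speculative read request (steps 702 to 707) can be modelled in software as follows. This is a sketch under assumptions: the request buffer and the four data buffers are merged into one ordered mapping (the text allows the buffers to be formed as one body), "oldest" is taken to mean the entry registered earliest, and the class name SpeculativeReadBuffer and its members are hypothetical.

```python
from collections import OrderedDict


class SpeculativeReadBuffer:
    """Holds speculatively read lines, keyed by cache entry number."""

    def __init__(self, capacity, cache_data):
        self.capacity = capacity      # number m of buffer entries
        self.cache_data = cache_data  # entry number -> list of lines, one per way
        self.entries = OrderedDict()  # entry number -> held lines for all ways

    def speculative_read(self, entry_number):
        """Return True if the request was registered, False if discarded."""
        if entry_number in self.entries:        # step 702: same entry held
            return False                        # discard the new request
        if len(self.entries) >= self.capacity:  # step 703: buffer full
            self.entries.popitem(last=False)    # step 704: drop oldest entry
        # steps 705 to 707: register the request and hold data of every way
        self.entries[entry_number] = list(self.cache_data[entry_number])
        return True
```

With a capacity of 2, registering entries 0, 1 and 2 in that order leaves entries 1 and 2 held, and a repeated request for an entry already held is dropped without touching the cache data section.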
[0032] 

If the request received from the coherent
controller 20 is not a speculative read request, but
a read request (step 708), then the request
controller 400 checks whether an address (cache
entry number) of the same cache entry as that of the
read request is stored in the speculative read
request buffer 401 beforehand (step 709). The
check is conducted in the same way as in the case of
the speculative read request. If there is the same
cache entry, then the request controller 400 reads
out data from the pertinent entry of the speculative
read data buffers 403 to 406, and sends the data to
the path 13 via the selectors 413 to 416, the selector
412, and the bus 407 as response data (step 710).
In other words, the request controller 400 outputs
a selection signal of the speculative read request
buffer side on the path 422, and outputs a hit way
number included in the read request to the path 421
as a selection signal. As a result, data
corresponding to 4 ways read out from the pertinent
entry of the speculative read data buffers 403 to
406 are first selected by the selectors 413 to 416.
Subsequently, data corresponding to the hit way
number in the pertinent 4 ways is selected by the
selector 412, and sent to the path 13 via the bus
407 as response data. Thereafter, the pertinent
entry of the speculative read request buffer 401 is
invalidated (step 711).
[0033] 

If the address of the same cache
entry as that of the read request is not present in
the speculative read request buffer 401, then the
request controller 400 transfers the pertinent read
request to the cache data section 7 via the paths 218
and 30, selects the cache data for the 4 ways read out
from the pertinent cache entry of the cache data
section 7 by using the selectors 413 to 416 via the
buses 408 to 411, selects, from that cache data, the
data corresponding to the hit way number by using the
selector 412, and sends out the selected data from
the bus 207 to the path 13 as response data (step
712). This case occurs where the data
read from the cache data section 7 into the
speculative read data buffers 403 to 406 in advance
by the speculative read request has been invalidated
by a write request (preceding write request),
described hereafter, before the subsequent
corresponding read request.
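
The read path of steps 708 to 712 can be summarized
by the following minimal sketch. The class
RequestControllerSketch and its dictionary-based
buffers are assumptions made for illustration; they
stand in for the request controller 400, the
speculative read request and data buffers, and the
cache data section 7.

```python
from typing import Dict, List, Tuple


class RequestControllerSketch:
    """Toy model of the read path (steps 708 to 712); not the hardware."""

    def __init__(self, cache_data: Dict[int, List[str]]) -> None:
        # cache entry number -> data of the 4 ways
        self.cache_data = cache_data
        # valid speculative entries: cache entry number -> data of the 4 ways
        self.spec_buffer: Dict[int, List[str]] = {}

    def speculative_read(self, entry: int) -> None:
        # Read all ways in advance, before the hit way is known.
        self.spec_buffer[entry] = list(self.cache_data[entry])

    def read(self, entry: int, hit_way: int) -> Tuple[str, str]:
        # Step 708: look for the same cache entry; popping the entry also
        # models its invalidation after use (step 711).
        ways = self.spec_buffer.pop(entry, None)
        if ways is not None:
            return "buffer", ways[hit_way]               # step 710
        return "cache", self.cache_data[entry][hit_way]  # step 712


rc = RequestControllerSketch({5: ["a", "b", "c", "d"]})
rc.speculative_read(5)
first = rc.read(5, hit_way=1)    # served from the speculative buffer
second = rc.read(5, hit_way=1)   # buffer entry already invalidated
```

In this sketch the first read is answered from the
buffer, and the second read of the same entry falls
back to the cache data section because the buffer
entry was invalidated after use.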
[0034] 

In the case where the request received
from the coherent controller 20 via the paths 12 and
217 is neither a speculative read request nor a read
request, i.e., in the case where the request
is a write request, the request controller 400
determines whether an address to the same cache entry
is stored in the speculative read request buffer 401
beforehand (step 713). If the address is present,
the request controller 400 invalidates the pertinent
entry of the speculative read request buffer 401
(step 714). Subsequently, the request controller
400 sends out a write request to the cache data
section 7 via the paths 218 and 30. At the same time,
the request controller 400 sends out cache data
received from the coherent controller 20 via the path
13 to the path 31 via the bus 207, the path 427, and
the buses 208 to 211, and writes the data into a
specified way number of a specified entry of the
cache data section 7 (step 715).
[0035] 

In the case where a request to the same entry 
as the write request received from the coherent 
controller 20 is present in the speculative read 
request buffer 401, the pertinent entry is 
invalidated at the step 714 in FIG. 9. The reason
for doing so is that otherwise the data in the cache
data section 7 would be rewritten by the write
operation and would no longer coincide with the data
in the speculative read data buffers 403 to 406. By
virtue of the invalidation processing of the step
714, the rewritten new data is read out from the
cache data section 7 at step 712 in a subsequent read
request for the same cache entry.
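
The need for the invalidation at step 714 can be
illustrated with a small sketch. The names
speculative_read, write, and read and the
dictionary-based buffers are hypothetical; the
invalidate flag exists only to show what would go
wrong if step 714 were omitted.

```python
from typing import Dict, List

# cache data section: cache entry number -> data of the 4 ways
cache_data: Dict[int, List[str]] = {3: ["old0", "old1", "old2", "old3"]}
# valid speculative read entries: cache entry number -> data of the 4 ways
spec_buffer: Dict[int, List[str]] = {}


def speculative_read(entry: int) -> None:
    spec_buffer[entry] = list(cache_data[entry])


def write(entry: int, way: int, data: str, invalidate: bool = True) -> None:
    if invalidate:
        spec_buffer.pop(entry, None)   # steps 713 and 714
    cache_data[entry][way] = data      # step 715


def read(entry: int, hit_way: int) -> str:
    ways = spec_buffer.pop(entry, None)
    if ways is not None:
        return ways[hit_way]           # speculative data is returned
    return cache_data[entry][hit_way]  # step 712


# With invalidation, a read after the write sees the new data.
speculative_read(3)
write(3, 1, "new1")
fresh = read(3, 1)

# Without invalidation, the buffer still holds the previous data.
speculative_read(3)
write(3, 1, "newer1", invalidate=False)
stale = read(3, 1)
```

In the second half of the sketch the read returns
the data that was buffered before the write, which
is the noncoincidence the step 714 prevents.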
[0036]

In the case where a read request received 
from the coherent controller 20 is a request to the 
same entry as a request in the speculative read 
request buffer 401, the cache data controller 6 
selectively returns data read in advance and stored 
in the speculative read data buffers 403 to 406, 
instead of data supplied from the cache data section 
7, in the present embodiment as shown in FIG. 9. As
a result, access latency of the cache data section
7 can be reduced. Therefore, if the coherent
controller 20 issues a speculative read request while
conducting the cache hit check as shown in FIG. 5,
the memory access latency can be reduced by the
cycles corresponding to the cache hit check time.
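
The latency reduction can be illustrated with simple
arithmetic. The model below is an assumption made
for illustration: it ignores transfer overheads and
treats the cache hit check and the cache data read as
fully overlapped when the speculative read is used.
The cycle counts are hypothetical and do not come
from the embodiment.

```python
def read_latency(tag_check_cycles: int,
                 data_read_cycles: int,
                 speculative: bool) -> int:
    """Cycles until response data is available, under a simplified model."""
    if speculative:
        # The data read starts together with the hit check, so the two
        # overlap and the longer of the two determines the latency.
        return max(tag_check_cycles, data_read_cycles)
    # Otherwise the data read starts only after the hit check completes.
    return tag_check_cycles + data_read_cycles


TAG_CHECK = 3   # hypothetical cycles for the cache hit check
DATA_READ = 5   # hypothetical cycles for reading the cache data section

conventional = read_latency(TAG_CHECK, DATA_READ, speculative=False)
with_speculation = read_latency(TAG_CHECK, DATA_READ, speculative=True)
saved = conventional - with_speculation
```

Under these assumed numbers the saving equals the
hit check time, which holds in this model whenever
the data read takes at least as long as the hit
check.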
[0037] 

FIGS. 10 and 11 show processing flows of the
coherent controller 20 and the request controller
400 in the cache data controller 6 in another
embodiment of the present invention.
[0038] 

FIG. 10 is the processing flow of the 
coherent controller 20. FIG. 10 is different from
FIG. 5 in that a step 800 has been added. In the case
where the speculative read request issued to the
cache data controller 6 at the step 302 results in
a cache miss, a request (speculative read data
discarding request) for invalidating the
speculative read data read in advance by the
pertinent speculative read request is issued to the
cache data controller 6 at the step 800. As a result,
the cache data controller 6 can invalidate unused
speculative read data stored in the speculative read
data buffer 402. Accordingly, effective use of the
speculative read request buffer 401 and the
speculative read data buffer 402 becomes possible.
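
The flow of FIG. 10 as described above can be
sketched as a function that returns the requests the
coherent controller would issue. The function name
handle_cpu_read, the request tuples, and the tag
section layout are hypothetical. The final memory
read on a miss is an assumption, since the text above
does not detail how the miss is served.

```python
from typing import Dict, List, Optional, Tuple

# (request kind, cache entry number, hit way number or None)
Request = Tuple[str, int, Optional[int]]


def handle_cpu_read(entry: int,
                    tag: int,
                    tag_section: Dict[int, List[Optional[int]]]
                    ) -> List[Request]:
    """List the requests issued for one CPU read (hypothetical model)."""
    # Step 302: the speculative read is issued before the hit check.
    issued: List[Request] = [("speculative_read", entry, None)]

    # Cache hit check: compare the tag with the tag of every way.
    ways = tag_section.get(entry, [])
    hit_way = next((w for w, t in enumerate(ways) if t == tag), None)

    if hit_way is not None:
        # Hit: the read request carries the hit way number.
        issued.append(("read", entry, hit_way))
    else:
        # Step 800: ask the cache data controller to discard the data.
        issued.append(("discard_speculative", entry, None))
        # Assumed: the miss is then served from the main memory.
        issued.append(("memory_read", entry, None))
    return issued


tags = {7: [0x10, 0x20, 0x30, 0x40]}
hit = handle_cpu_read(7, 0x30, tags)
miss = handle_cpu_read(7, 0x99, tags)
```

On a hit the sketch issues the speculative read
followed by the read request. On a miss it issues
the discarding request so that the buffered data
does not remain unused.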
[0039] 

FIG. 11 is a processing flow of the request 
controller 400 included in the cache data controller 
6. FIG. 11 is different from FIG. 9 in that steps 
900 and 901 have been added. The steps 900 and 901
are a processing flow conducted in the case where
a speculative read cancellation request, i.e., the
speculative read data discarding request, has been
accepted from the coherent controller 20. In other
words, upon receiving a speculative read data
discarding request from the coherent controller 20
(step 900), the request controller 400 invalidates
an entry in the speculative read request buffer 401
in which a cache entry number of a speculative read
request corresponding to the pertinent speculative
read data discarding request has been registered
(step 901). As a result, effective use of the
speculative read request buffer 401 and the
speculative read data buffer 402 becomes possible.
If each of the buffers 401 and 402 is formed with
a margin of a certain degree in the number of entries,
it becomes possible to eliminate the full state and
it also becomes possible to make the full control
itself of the steps 703 and 704 unnecessary.
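
The effect of the discarding request on buffer
occupancy can be illustrated as follows. The class
SpeculativeReadRequestBuffer, the function
count_accepted, and the capacity of 2 entries are
hypothetical. Refusing registration when the buffer
is full is an assumption made for this sketch and is
not taken from the steps 703 and 704.

```python
from typing import Iterable, Set


class SpeculativeReadRequestBuffer:
    """Toy model that tracks only which cache entry numbers are valid."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.valid: Set[int] = set()

    def register(self, entry: int) -> bool:
        # Assumed behavior: refuse registration when the buffer is full.
        if entry not in self.valid and len(self.valid) >= self.capacity:
            return False
        self.valid.add(entry)
        return True

    def discard(self, entry: int) -> None:
        # Steps 900 and 901: invalidate the entry that registered this
        # cache entry number.
        self.valid.discard(entry)


def count_accepted(missed_entries: Iterable[int],
                   capacity: int,
                   send_discard: bool) -> int:
    """Count speculative reads accepted when every one ends in a miss."""
    buf = SpeculativeReadRequestBuffer(capacity)
    accepted = 0
    for entry in missed_entries:
        if buf.register(entry):
            accepted += 1
        if send_discard:
            buf.discard(entry)   # discarding request after the cache miss
    return accepted


without_discard = count_accepted([1, 2, 3], capacity=2, send_discard=False)
with_discard = count_accepted([1, 2, 3], capacity=2, send_discard=True)
```

Without the discarding request the unused entries
fill the buffer and the third speculative read is
not accepted. With it, every speculative read finds
a free entry.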
[0040] 

In the case where a read request received 
from the coherent controller 20 is a request to the 
same entry as a request in the speculative read 
request buffer 401, the cache data controller 6 reads 
out data from some of the speculative read data 
buffers 403 to 406, instead of data supplied from 
the cache data section 7, in the present embodiment
as well, in the same way as in the first embodiment
described above. As a result, access latency of the
cache data section 7 can be reduced. Therefore, if
the coherent controller 20 issues a speculative read
request while conducting the cache hit check, the
memory access latency can be reduced by the cycles
corresponding to the cache hit check time.
[0041]

Heretofore, in the embodiments of the
present invention, it has been assumed that the cache
is 4-way set associative. However, the number of
ways may be an arbitrary number of at least one.
Furthermore, it is a matter of course that the
present invention is not limited to a set associative
cache, but can be widely
applied to a computer system using a cache
scheme in which the cache tag portion and the cache
data portion are managed in separate LSIs.
[0042] 

[Effects of the Invention] 

As heretofore described, it becomes
possible according to the present invention to
shorten the cache data readout time in the case where
the cache tag portion and the cache data portion are
managed in separate LSIs in order to implement a
large-capacity cache.
[Brief Description of Drawings] 
[Fig. 1]

FIG. 1 is a block diagram showing a computer
system of an embodiment of the present invention.
[Fig. 2]

FIG. 2 is a diagram showing relations among
an address supplied from a CPU, a cache tag, and a
cache entry number.
[Fig. 3]

FIG. 3 is a diagram showing a configuration
example of a cache tag section.
[Fig. 4]

FIG. 4 is a diagram showing a configuration
example of a cache data section.
[Fig. 5]

FIG. 5 is a processing flow diagram of a
coherent controller according to the first
embodiment of the present invention.
[Fig. 6]

FIG. 6 is a detailed block diagram of a cache
data controller.
[Fig. 7]

FIG. 7 is a diagram showing a configuration
example of a speculative read request buffer in the
cache data controller.
[Fig. 8]

FIG. 8 is a diagram showing a speculative
read data buffer in the cache data controller.
[Fig. 9]

FIG. 9 is a processing flow diagram of a
request controller in the cache data controller
according to the first embodiment of the present
invention.
[Fig. 10]

FIG. 10 is a processing flow diagram of a
coherent controller according to a second embodiment
of the present invention.
[Fig. 11]

FIG. 11 is a processing flow diagram of a
request controller in a cache data controller
according to a second embodiment of the present
invention.
[Fig. 12]

FIG. 12 is a diagram showing a configuration
example of a conventional 4-way set associative
cache.

[Description of Reference Numerals] 
1, 2 CPU

5 cache tag section 

6 cache data controller 

7 cache data section 

8 main memory 

20 coherent controller 

21 memory access controller 

400 request controller 

401 speculative read request buffer 

402 address comparator

403-406 speculative read data buffers 